Biostatistics For Dummies (Monika Wahi John Pezzullo)

the cells in the table have large counts, but it becomes unreliable when one or more cell counts are

very small (or zero). There are different recommendations as to the minimum counts you need per

cell in order to confidently use the chi-square test. A rule of thumb that many analysts use is that

you should have at least five observations in each cell of your table (or better yet, an expected

count of at least five in each cell).

It’s not good at detecting trends. The chi-square test isn’t good at detecting small but steady

progressive trends across the successive categories of an ordinal variable (see Chapter 4 if you’re

not sure what ordinal is). It may give a significant result if the trend is strong enough, but it’s not

designed specifically to work with ordinal categorical data. In those cases, you should use a

Mantel-Haenszel chi-square test for trend, which is outside the scope of this book.
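
If you want to check the expected-count rule of thumb before trusting a chi-square result, the arithmetic is short enough to do yourself. The sketch below assumes the usual formula (row total times column total, divided by the grand total) and uses a made-up table; the function names and the counts are illustrative only and don't come from the book's sample data.

```python
def expected_counts(table):
    """Expected count per cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def meets_rule_of_thumb(table, minimum=5):
    """True if every expected count is at least `minimum` (5 by default)."""
    smallest = min(min(row) for row in expected_counts(table))
    return smallest >= minimum


# Hypothetical 2x2 table, invented for illustration.
hypothetical_table = [[12, 3], [5, 10]]
print(expected_counts(hypothetical_table))
print(meets_rule_of_thumb(hypothetical_table))
```

Notice that one observed cell in this made-up table is only 3, yet every expected count is above 5. That is why the rule is often stated in terms of expected counts rather than observed counts.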

Modifying the chi-square test: The Yates continuity correction

There is a little drama around the original Pearson chi-square test of association that needs to be

mentioned here. Yates, who was a contemporary of Pearson, developed what is called the Yates

continuity correction. Yates argued that in the special case of the fourfold table, adding this correction

results in more reliable p values. The correction consists of subtracting 0.5 from the magnitude of the

(observed – expected) difference before squaring it.
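
To see the correction as a calculation, here is a minimal sketch of the chi-square statistic for a fourfold table, with the Yates adjustment as an option. The function name, the `yates` flag, and the table are assumptions made up for illustration. The sketch also stops the adjusted difference from going below zero, which is a common safeguard rather than something the description above requires.

```python
def chi_square_2x2(table, yates=True):
    """Pearson chi-square for a 2x2 table, optionally with the Yates correction."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / grand_total
            magnitude = abs(table[i][j] - expected)
            if yates:
                # Subtract 0.5 from the magnitude before squaring (never below 0).
                magnitude = max(magnitude - 0.5, 0.0)
            statistic += magnitude ** 2 / expected
    return statistic


# Hypothetical 2x2 table, invented for illustration.
hypothetical_table = [[12, 3], [5, 10]]
print(chi_square_2x2(hypothetical_table, yates=False))
print(chi_square_2x2(hypothetical_table, yates=True))
```

The corrected statistic is always the smaller of the two, so the corrected p value is always the larger one.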

Let’s apply the Yates continuity correction for your analysis of the sample data in the earlier section

“Understanding how the chi-square test works.” Take a look at Figure 12-3, which has the differences

between the values in the observed and expected cells. The application of the Yates correction changes

the 7.20 (or –7.20) difference in each cell to 6.70 (or –6.70). This lowers the chi-square value from

8.81 down to 7.63 and increases the p value from 0.0030 to 0.0057, which is still very significant —

the chance of random fluctuations producing such an apparent effect in your sample is only about 1 in

175 (because 1/0.0057 ≈ 175).
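
You can check these numbers without the original table. In a fourfold table, all four cells have the same magnitude of difference, so the correction shrinks the whole chi-square value by one common factor. The sketch below uses only the figures quoted in this section (7.20, 8.81, and 0.0057). The helper name `p_value_1df` is made up, and it relies on the fact that a fourfold table has 1 degree of freedom.

```python
from math import erfc, sqrt


def p_value_1df(chi_square):
    """Upper-tail p value for a chi-square statistic with 1 degree of freedom."""
    return erfc(sqrt(chi_square / 2))


difference = 7.20    # magnitude of observed minus expected, from the text
uncorrected = 8.81   # Pearson chi-square value, from the text

# Every cell's squared difference shrinks by the same factor in a 2x2 table.
shrink = ((difference - 0.5) / difference) ** 2
corrected = uncorrected * shrink

print(round(corrected, 2))
print(round(p_value_1df(uncorrected), 4))
print(round(p_value_1df(corrected), 4))
print(round(1 / 0.0057))  # the "1 in ..." odds quoted in the text
```

Because 8.81 is itself a rounded value, expect the results to agree with the quoted 7.63, 0.0030, and 0.0057 only up to rounding.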

The Yates correction to the Pearson chi-square test applies only to the fourfold table (and not to

tables with more rows or columns), and some statisticians feel the correction is too strict.

Nevertheless, it has been built into statistical software like R, so if you run a Pearson chi-square

test on a fourfold table in most statistical software, the Yates correction is applied automatically

(see Chapter 4 for a discussion of statistical software).

Focusing on the Fisher Exact Test

The Pearson chi-square test described earlier isn’t the only way to analyze cross-tabulated data.

Remember that one of the cons was that it is not an exact test? Famous but controversial statistician R.

A. Fisher invented another test in the 1920s that gives the exact p value and can handle tables with very

small cell counts (even cell counts of zero!). Not surprisingly, this test is called the Fisher Exact test

(also sometimes referred to as Fisher’s exact test, or just Fisher).
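
To get a feel for what "exact" means, here is a minimal sketch of a two-sided Fisher Exact test for a fourfold table. It lists every table that has the same row and column totals as yours, then adds up the probabilities of the tables that are no more likely than the one you observed. That is one common convention for the two-sided p value, and it is an assumption here. The function name and both tables are invented for illustration.

```python
from math import comb


def fisher_exact_two_sided(table):
    """Two-sided Fisher Exact p value for a 2x2 table [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    row1, row2 = a + b, c + d
    col1 = a + c
    grand_total = row1 + row2

    def weight(x):
        # Number of ways to get x in the top-left cell with the margins fixed.
        return comb(row1, x) * comb(row2, col1 - x)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed = weight(a)

    # Integer weights keep the "no more likely than observed" comparison exact.
    extreme = sum(
        weight(x) for x in range(lowest, highest + 1) if weight(x) <= observed
    )
    return extreme / comb(grand_total, col1)


# Hypothetical tables, invented for illustration; the second has zero cells.
print(fisher_exact_two_sided([[3, 1], [1, 3]]))
print(fisher_exact_two_sided([[4, 0], [0, 4]]))
```

Nothing in this calculation depends on the counts being large, which is why zero cells cause no trouble.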

Understanding how the Fisher Exact test works

Like with the chi-square, you don’t have to know the details of the Fisher Exact test to use it. If you